Enhanced graph based approach for multi document summarization

Authors

  • Shanmugasundaram Hariharan
  • Thirunavukkarasu Ramkumar
  • Rengaramanujam Srinivasan
Abstract

Summarizing documents to meet a user's needs is a challenging task. Among the many approaches available, graph-based methods have been widely investigated for summarizing document contents. This paper focuses on two graph-based methods proposed by Erkan and Radev: LexRank (threshold) and LexRank (continuous). It proposes two enhancements to that earlier work, each adding a feature to the existing methods. First, a discounting approach is introduced when forming the summary, which reduces redundancy among the selected sentences. Second, a position weight mechanism is adopted so that a sentence's importance reflects the position it occupies. Intrinsic evaluation was carried out on two data sets. Data set 1 was created manually from newspaper documents that we collected for the experiments. Data set 2 is drawn from the DUC 2002 data, which is commercially available and distributed or accessed through the National Institute of Standards and Technology (NIST). Measured by precision and recall, the results were comprehensively better than those of the earlier algorithms.
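The abstract names three ingredients: LexRank scoring (threshold or continuous), a position weight, and a discounting step that limits redundancy. The sketch below shows one way these could fit together. It is not the authors' implementation: the abstract gives no formulas, so the linear position weight, the multiplicative discount `1 - similarity`, the plain term-frequency cosine (instead of tf-idf), the damping value 0.85, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def similarity_matrix(sentences):
    """Pairwise cosine similarities (term frequency only, an assumption)."""
    bags = [Counter(s.lower().split()) for s in sentences]
    return [[cosine(x, y) for y in bags] for x in bags]


def lexrank(sim, threshold=None, damping=0.85, iters=100):
    """Power iteration over the sentence graph.

    threshold=None keeps the cosine values as edge weights (continuous);
    a number keeps only edges above it, unweighted (threshold variant).
    """
    n = len(sim)
    if n == 0:
        return []
    if threshold is not None:
        sim = [[1.0 if v > threshold else 0.0 for v in row] for row in sim]
    rows = []
    for row in sim:
        total = sum(row)
        # Rows without any edge fall back to a uniform distribution.
        rows.append([v / total for v in row] if total else [1.0 / n] * n)
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[j] * rows[j][i] for j in range(n))
            for i in range(n)
        ]
    return scores


def summarize(docs, k=2, threshold=None):
    """Pick k sentences from docs (a list of documents, each a list of sentences)."""
    sentences, weights = [], []
    for doc in docs:
        for pos, sentence in enumerate(doc):
            sentences.append(sentence)
            # Assumed position weight: earlier sentences count more.
            weights.append(1.0 - pos / len(doc))
    sim = similarity_matrix(sentences)
    scores = [s * w for s, w in zip(lexrank(sim, threshold), weights)]
    chosen, remaining = [], set(range(len(sentences)))
    while remaining and len(chosen) < k:
        best = max(remaining, key=lambda i: (scores[i], -i))
        chosen.append(best)
        remaining.discard(best)
        for i in remaining:
            # Assumed discount: penalize sentences similar to the chosen one.
            scores[i] *= max(0.0, 1.0 - sim[best][i])
    return [sentences[i] for i in chosen]
```

Calling `summarize(docs, k=2)` on two documents that share an identical opening sentence should return that sentence only once, because the discount drives the score of its duplicate towards zero after the first copy is selected.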

Similar resources

Query-focused Multi-Document Summarization: Combining a Topic Model with Graph-based Semi-supervised Learning

Graph-based learning algorithms have been shown to be an effective approach for query-focused multi-document summarization (MDS). In this paper, we extend the standard graph ranking algorithm by proposing a two-layer (i.e. sentence layer and topic layer) graph-based semi-supervised learning approach based on topic modeling techniques. Experimental results on TAC datasets show that by considerin...

A Graph-based Approach to Cross-language Multi-document Summarization

Cross-language summarization is the task of generating a summary in a language different from the language of the source documents. In this paper, we propose a graph-based approach to multi-document summarization that integrates machine translation quality scores in the sentence extraction process. We evaluate our method on a manually translated subset of the DUC 2004 evaluation campaign. Resul...

Improved Affinity Graph Based Multi-Document Summarization

This paper describes an affinity graph based approach to multi-document summarization. We incorporate a diffusion process to acquire semantic relationships between sentences, and then compute information richness of sentences by a graph rank algorithm on differentiated intra-document links and inter-document links between sentences. A greedy algorithm is employed to impose diversity penalty on ...

Automatic Multi Document Summarization Approaches

Problem statement: Text summarization can be of different nature ranging from indicative summary that identifies the topics of the document to informative summary which is meant to represent the concise description of the original document, providing an idea of what the whole content of document is all about. Approach: Single document summary seems to capture both the information well but it ha...

Building Document Graphs for Multiple News Articles Summarization: An Event-Based Approach

Since most news articles report several events, and these events are referred to in many related documents, we propose an event-based approach to visualize documents as graphs at different conceptual granularities. With a graph-based ranking algorithm, we illustrate the application of the document graph to multi-document summarization. Experiments on DUC data indicate that our approach is competitive wi...

Graph-Based Methods for Multi-document Summarization: Exploring Relationship Maps, Complex Networks and Discourse Information

In this work we investigate the use of graphs for multi-document summarization. We adapt the traditional Relationship Map approach to the multi-document scenario and, in a hybrid approach, we consider adding CST (Cross-document Structure Theory) relations to this adapted model. We also investigate some measures derived from graphs and complex networks for sentence selection. We show that the supe...

Journal:
  • Int. Arab J. Inf. Technol.

Volume 10

Publication date: 2013